792. Number of Matching Subsequences
Medium

Given a string s and an array of strings words, return the number of words[i] that is a subsequence of s.

A subsequence of a string is a new string generated from the original string with some characters (can be none) deleted without changing the relative order of the remaining characters.

  • For example, "ace" is a subsequence of "abcde".

 

Example 1:

Input: s = "abcde", words = ["a","bb","acd","ace"]
Output: 3
Explanation: There are three strings in words that are a subsequence of s: "a", "acd", "ace".

Example 2:

Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]
Output: 2

 

Constraints:

  • 1 <= s.length <= 5 * 104
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 50
  • s and words[i] consist of only lowercase English letters.
Accepted
78,771
Submissions
161,612
Seen this question in a real interview before?
Companies
Related Topics

Average Rating: 4.47 (45 votes)

Premium

Approach #1: Brute Force [Time Limit Exceeded]

Intuition and Algorithm

Let's try to check separately whether each word words[i] is a subsequence of S.

For every such word, we try to find the first occurrence of word[0] in S, then from that point on try to find the first occurrence of word[1], and so on.

Complexity Analysis

  • Time Complexity: O(words.lengthS.length+iwords[i].length)O(\text{words.length} * S\text{.length} + \sum_i \text{words[i].length}). For each word, our subseq check on word words[i] may take time complexity O(S.length+words[i].length)O(S\text{.length} + \text{words[i].length}).

  • Space Complexity: O(1)O(1). (In Java, our space complexity is O(S.length+max(words[i].length))O(S\text{.length} + \text{max(words[i].length)}), but can be made to be O(1)O(1) with a pointer based implementation.)


Approach #2: Next Letter Pointers [Accepted]

Intuition

Since the length of S is large, let's think about ways to iterate through S only once, instead of many times as in the brute force solution.

We can put words into buckets by starting character. If for example we have words = ['dog', 'cat', 'cop'], then we can group them 'c' : ('cat', 'cop'), 'd' : ('dog',). This groups words by what letter they are currently waiting for. Then, while iterating through letters of S, we will move our words through different buckets.

For example, if we have a string like S = 'dcaog':

  • heads = 'c' : ('cat', 'cop'), 'd' : ('dog',) at beginning;
  • heads = 'c' : ('cat', 'cop'), 'o' : ('og',) after S[0] = 'd';
  • heads = 'a' : ('at',), 'o' : ('og', 'op') after S[0] = 'c';
  • heads = 'o' : ('og', 'op'), 't': ('t',) after S[0] = 'a';
  • heads = 'g' : ('g',), 'p': ('p',), 't': ('t',) after S[0] = 'o';
  • heads = 'p': ('p',), 't': ('t',) after S[0] = 'g';

Algorithm

Instead of a dictionary, we'll use an array heads of length 26, one entry for each letter of the alphabet. For each letter in S, we'll take all the words waiting for that letter, and have them wait for the next letter in that word. If we use the last letter of some word, it adds 1 to the answer.

For another description of this algorithm and a more concise implementation, please see @StefanPochmann's excellent forum post here.

Complexity Analysis

  • Time Complexity: O(S.length+iwords[i].length)O(S\text{.length} + \sum_i \text{words[i].length}).

  • Space Complexity: O(words.length)O(\text{words.length}). (In Java, our additional space complexity is O(iwords[i].length)O(\sum_i \text{words[i].length}), but can be made to be O(words.length)O(\text{words.length}) with a pointer based implementation.)

Report Article Issue

Comments: 32
ywen1995's avatar
Read More

I really hope I could come up with such an answer in real interviews. gj!

54
Show 1 reply
Reply
Share
Report
mazytes's avatar
Read More

[Next Letter Pointers] is really a nice solution.
Based on the shown implementation, the first letter is already removed from dict since not needed. Therefore, I feel the explanation should be adjusted as follows:
// at beginning, heads = 'c' : ('at', 'op'), 'd' : ('og')
// after S[0] = 'd', heads = 'c' : ('at', 'op'), 'o' : ('g')
// after S[1] = 'c', heads = 'a' : ('t'), 'o' : ('g', 'p')
// after S[2] = 'a', heads = 'o' : ('g', 'p'), 't': ('')
// after S[3] = 'o', heads = 'g' : (''), 'p': (''), 't': ('')
// after S[4] = 'g', heads = 'p': (''), 't': (''), found '' in g, so res++

11
Reply
Share
Report
edaengineer's avatar
Read More

@awice

Can you explain how complexity is O(S.length+∑​i​​words[i].length), not O( S.size() * #of words) ?

The space complexity seems O(# of words). How could pointer based implementation give O(1) space?

Time Complexity: O(words.length∗S.length+∑iwords[i].length)O(\text{words.length} * S\text{.length} + \sum_i \text{words[i].length})O(words.length∗S.length+∑​i​​words[i].length). For each word, our subseq check on word words[i] may take time complexity O(S.length+words[i].length)O(S\text{.length} + \text{words[i].length})O(S.length+words[i].length).

Isn't for each word, subseq check on word words[i] has time complexity O(S.length). Why O(S.length+words[i].length) ?

9
Show 3 replies
Reply
Share
Report
spetz911's avatar
Read More

Here is a simplified version of this solutions without messing with iterators and ord/chr:

class Solution:
    
    def numMatchingSubseq(self, S: str, words: List[str]) -> int:
        n_subsequences = 0
        state = defaultdict(list)
        for word in words:
            state[word[0]].append(word[1:])
        for char in S:
            old_bucket = state[char]
            state[char] = []
            for suffix in old_bucket:
                if not suffix:
                    n_subsequences += 1
                else:
                    state[suffix[0]].append(suffix[1:])
        return n_subsequences

6
Show 1 reply
Reply
Share
Report
ThejusSinghJ's avatar
Read More

I fell both have similar complexity

4
Reply
Share
Report
rxrxrx's avatar
Read More

Shouldn't the time complexity for Approach#2 be O(s.length() * words.length)? Considering the case where all words are the same: outer for loop takes s.length(), inner for loop takes words.length.

2
Show 1 reply
Reply
Share
Report
aleksey12345's avatar
Read More

Why not to use Trie instead of buckets ?

3
Show 1 reply
Reply
Share
Report
elif1's avatar
Read More

could someone please help me understand the time complexity for the second solution?
Why are we adding the terms in the big O and not multiplying them? I got
O(|S| * #words) - can someone help?

1
Reply
Share
Report
velvetrevolver2001's avatar
Read More

there's no reason to do old_bucket.clear(), removing that will improve performance

1
Reply
Share
Report
park29's avatar
Read More

This solution is really good! Many thanks

1
Reply
Share
Report
Time Submitted
Status
Runtime
Memory
Language
06/23/2021 06:32Accepted172 ms48.1 MBcpp
06/15/2021 12:00Accepted180 ms48 MBcpp
05/25/2021 20:07Accepted168 ms48 MBcpp
05/25/2021 20:02Time Limit ExceededN/AN/Acpp
06/23/2020 16:18Accepted412 ms44.1 MBcpp
06/23/2020 16:16Wrong AnswerN/AN/Acpp
/23
Your previous code was restored from your local storage.Reset to default
Contribute
|||
Saved
Trie
This list is empty.